# llama.cpp compatibility
## MiniCPM4-8B-Q8_0-GGUF
- **License:** Apache-2.0
- **Description:** A model converted from openbmb/MiniCPM4-8B to GGUF format via llama.cpp, suitable for local inference.
- **Tags:** Large Language Model · Transformers · Multilingual
- **Author:** AyyYOO (160 downloads · 2 likes)
## inf-o1-pi0-GGUF
- **Description:** A quantized version of the infly/inf-o1-pi0 model, built with llama.cpp's imatrix quantization and supporting multilingual text generation.
- **Tags:** Large Language Model · Multilingual
- **Author:** bartowski (301 downloads · 1 like)
## Qwen3-32B-GGUF
- **License:** Apache-2.0
- **Description:** A quantized version of Qwen/Qwen3-32B, produced with llama.cpp and offered in multiple quantization types for different hardware requirements.
- **Tags:** Large Language Model
- **Author:** bartowski (49.13k downloads · 35 likes)
## Rombo-LLM-V3.1-QWQ-32b-GGUF
- **License:** Apache-2.0
- **Description:** Rombo-LLM-V3.1-QWQ-32b is a 32B-parameter large language model, processed with llama.cpp's imatrix quantization and offered in multiple quantization versions to accommodate different hardware.
- **Tags:** Large Language Model
- **Author:** bartowski (2,132 downloads · 5 likes)
## Llama3-8B-1.58-100B-tokens-GGUF
- **Description:** A GGUF-format model converted from the Meta-Llama-3-8B-Instruct and HF1BitLLM/Llama3-8B-1.58-100B-tokens models, suitable for llama.cpp inference.
- **Tags:** Large Language Model · Transformers
- **Author:** brunopio (2,035 downloads · 16 likes)
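The GGUF files listed above follow the same general workflow: convert a Hugging Face checkpoint to GGUF with llama.cpp's conversion script, then quantize it to the desired type. A minimal sketch using llama.cpp's own tools; the model directory and output file names are illustrative, and the commands assume a local checkout and build of llama.cpp:

```shell
# Convert a locally downloaded Hugging Face checkpoint to an F16 GGUF file.
# convert_hf_to_gguf.py ships in the llama.cpp repository.
python convert_hf_to_gguf.py ./MiniCPM4-8B --outfile minicpm4-8b-f16.gguf

# Quantize the F16 GGUF to Q8_0 (other types, e.g. Q4_K_M, trade size for quality).
./llama-quantize minicpm4-8b-f16.gguf minicpm4-8b-q8_0.gguf Q8_0

# Run local inference on the quantized file.
./llama-cli -m minicpm4-8b-q8_0.gguf -p "Hello" -n 32
```

imatrix-quantized releases (such as bartowski's) additionally pass an importance matrix, computed with llama.cpp's `llama-imatrix` tool over a calibration dataset, to `llama-quantize` via `--imatrix`.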